Two-step TAG parsing revisited

Authors

  • Peter Poller
  • Tilman Becker
Abstract

Based on the work in (Poller, 1994) and a minor assumption about a normal form for TAGs, we present a highly simplified version of the two-step parsing approach for TAGs which allows for a much easier analysis of run-time and space complexity. It also suggests how restrictions on the grammars might result in improvements in run-time complexity. The main advantage of a two-step parsing system shows in practical applications like Verbmobil (Bub et al., 1997), where the parser must look at multiple hypotheses supplied by a speech recognizer (encoded in a word hypotheses lattice) and filter out illicit hypotheses as early as possible. The first (context-free) step of our parser quickly filters out some illicit hypotheses in O(n^3) time; the constructed parsing matrix is then reused for the second step, the complete O(n^6) TAG parse.
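As a rough illustration of the first step only, the sketch below runs a standard CKY recognizer (cubic in sentence length) over each hypothesis and keeps the chart of every hypothesis that survives, so a later stage could reuse it. This is not the authors' algorithm: the toy grammar, the names `BINARY`, `LEXICAL`, `cky_chart` and `filter_hypotheses`, and the representation of hypotheses as plain word lists instead of a word lattice are all assumptions made for the example. The O(n^6) TAG step is not shown.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (not taken from the paper).
BINARY = {                      # (B, C) -> {A} for rules A -> B C
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICAL = {                     # word -> {A} for rules A -> word
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def cky_chart(words):
    """Fill a CKY chart: chart[i, k] holds the nonterminals deriving words[i:k]."""
    n = len(words)
    chart = {}
    for i, word in enumerate(words):
        chart[i, i + 1] = set(LEXICAL.get(word, ()))
    for width in range(2, n + 1):           # span length
        for i in range(n - width + 1):      # span start
            k = i + width                   # span end
            cell = set()
            for j in range(i + 1, k):       # split point
                for b, c in product(chart[i, j], chart[j, k]):
                    cell |= BINARY.get((b, c), set())
            chart[i, k] = cell
    return chart


def filter_hypotheses(hypotheses, start="S"):
    """Keep only hypotheses the context-free grammar accepts, with their charts."""
    kept = []
    for words in hypotheses:
        chart = cky_chart(words)
        if words and start in chart[0, len(words)]:
            kept.append((words, chart))     # chart is available for a second pass
    return kept


hypotheses = [
    ["the", "dog", "sees", "the", "cat"],
    ["dog", "the", "sees"],
    ["the", "cat", "sees", "the", "dog"],
]
survivors = filter_hypotheses(hypotheses)
```

Returning the chart together with each surviving hypothesis mirrors the idea of reusing the parsing matrix: the cheap pass discards hypotheses early, and the more expensive pass starts from work already done.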

Similar resources

Multiword Expression-Aware A* TAG Parsing Revisited

A* algorithms enable efficient parsing within the context of large grammars and/or complex syntactic formalisms. Besides, it has been shown that promoting multiword expressions (MWEs) is a beneficial strategy in dealing with syntactic ambiguity. The state-of-the-art A* heuristic for promoting MWEs in tree-adjoining grammar (TAG) parsing has certain drawbacks: it is not monotonic and it composes...

PLCFRS Parsing Revisited: Restricting the Fan-Out to Two

Linear Context-Free Rewriting System (LCFRS) is an extension of Context-Free Grammar (CFG) in which a non-terminal can dominate more than a single continuous span of terminals. Probabilistic LCFRS have recently successfully been used for the direct data-driven parsing of discontinuous structures. In this paper we present a parser for binary PLCFRS of fan-out two, together with a novel monotonou...

Interfacing Sentential and Discourse TAG-based Grammars

Tree-Adjoining Grammars (TAG) have been used both for syntactic parsing, with sentential grammars, and for discourse parsing, with discourse grammars. But the modeling of discourse connectives (coordinate conjunctions, subordinate conjunctions, adverbs, etc.) in TAG-based formalisms for discourse differs from their modeling in sentential grammars. Because of this mismatch, an intermediate, not ...

Lambek Grammars, Tree Adjoining Grammars and Hyperedge Replacement Grammars

Two recent extensions of the nonassociative Lambek calculus, the Lambek-Grishin calculus and the multimodal Lambek calculus, are shown to generate the same class of languages as tree adjoining grammars, using (tree generating) hyperedge replacement grammars as an intermediate step. As a consequence, both extensions are mildly context-sensitive formalisms and benefit from polynomial parsing algorithms.

Antecedent Recovery: Experiments with a Trace Tagger

This paper explores the problem of finding non-local dependencies. First, we isolate a set of features useful for this task. Second, we develop both a two-step approach which combines a trace tagger with a state-of-the-art lexicalized parser and a one-step approach which finds nonlocal dependencies while parsing. We find that the former outperforms the latter because it makes better use of the ...

Publication date: 1998